Polynomials and models of type theory
This thesis studies the structure of categories of polynomials, the diagrams that represent polynomial functors. Specifically, we construct new models of intensional dependent type theory based on these categories.
Firstly, we formalize the conceptual viewpoint that polynomials are built out of sums and products.
Polynomial functors make sense in a category when there exist pseudomonads freely adding indexed sums and products to fibrations over the category, and a category of polynomials is obtained by adding sums to the opposite of the codomain fibration.
A fibration with sums and products is essentially the structure defining a categorical model of dependent type theory. For such a model the base category of the fibration should also be identified with the fibre over the terminal object. Since adding sums does not preserve this property, we are led to consider a general method for building new models of type theory from old ones, by first performing a fibrewise construction and then extending the base.
Applying this method to the polynomial construction, we show that given a fibration with sufficient structure modelling type theory, there is a new model in a category of polynomials.
The key result is establishing that although the base category is not locally cartesian closed, this model has dependent product types.
Finally, we investigate the properties of identity types in this model, and consider the link with functional interpretations in logic.
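For orientation, the usual textbook presentation of a polynomial and its associated functor can be written as follows. This is a sketch of the standard definition in a locally cartesian closed category $\mathcal{E}$, not the more general fibrational setting the thesis develops; the names $s$, $f$, $t$, $I$, $J$, $A$, $B$ are notation assumed here, not taken from the abstract.

```latex
% A polynomial from I to J is a diagram of the shape
\[
  I \xleftarrow{\;s\;} B \xrightarrow{\;f\;} A \xrightarrow{\;t\;} J
\]
% and it represents the functor: pull back along s, take the product along f,
% then take the sum along t.
\[
  P = \Sigma_t \, \Pi_f \, \Delta_s \colon \mathcal{E}/I \to \mathcal{E}/J
\]
% In the category of sets, with A_j = t^{-1}(j) and B_a = f^{-1}(a), this reads
\[
  P\bigl((X_i)_{i \in I}\bigr)_j \;=\; \sum_{a \in A_j} \; \prod_{b \in B_a} X_{s(b)}
\]
```

The last line makes the "sums and products" reading concrete: each component of the output is a sum of products of the input components.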
Partially-Static Data as Free Extension of Algebras
Partially-static data structures are a well-known technique for improving binding times. However, they are often defined in an ad-hoc manner, without a unifying framework to ensure full use of the equations associated with each operation. We present a foundational view of partially-static data structures as free extensions of algebras for suitable equational theories, i.e. the coproduct of an algebra and a free algebra in the category of algebras and their homomorphisms. By precalculating these free extensions, we construct a high-level library of partially-static data representations for common algebraic structures. We demonstrate our library with common use-cases from the literature: string and list manipulation, linear algebra, and numerical simplification. Supported by the European Research Council grant ‘events causality and symmetry - the next-generation semantics’; the Engineering and Physical Sciences Research Council grant EP/N007387/1 ‘Quantum computation as a programming language’, and a Balliol College Oxford Career Development Fellowship.
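As a rough illustration of the idea for the string monoid, a partially-static string can be kept as a sequence of known (static) pieces and unknown (dynamic) variables, normalised with the monoid laws. This is a minimal sketch written for this listing, not the authors' library; the names `Dyn`, `sta`, `dyn`, `normalise` and `concat` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Dyn:
    """A dynamic (not yet known) string, identified by a variable name."""
    name: str


# A partially-static string: static str pieces interleaved with Dyn variables.
Part = Union[str, Dyn]


def normalise(parts: List[Part]) -> List[Part]:
    """Apply the monoid equations: drop empty strings, merge adjacent statics."""
    out: List[Part] = []
    for p in parts:
        if isinstance(p, str):
            if p == "":
                continue  # unit law: "" contributes nothing
            if out and isinstance(out[-1], str):
                out[-1] = out[-1] + p  # associativity: fuse static neighbours
                continue
        out.append(p)
    return out


def sta(s: str) -> List[Part]:
    """Inject a statically known string."""
    return normalise([s])


def dyn(name: str) -> List[Part]:
    """Inject a dynamic variable (hypothetical name, for illustration)."""
    return [Dyn(name)]


def concat(xs: List[Part], ys: List[Part]) -> List[Part]:
    """Monoid multiplication on partially-static strings."""
    return normalise(xs + ys)
```

For example, `concat(dyn("x"), concat(sta("a"), sta("b")))` keeps the variable `x` symbolic while the static pieces are fused into a single `"ab"`, whichever way the concatenations are bracketed.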
Asynchronous Algorithmic Alignment with Cocycles
State-of-the-art neural algorithmic reasoners make use of message passing in
graph neural networks (GNNs). But typical GNNs blur the distinction between the
definition and invocation of the message function, forcing a node to send
messages to its neighbours at every layer, synchronously. When applying GNNs to
learn to execute dynamic programming algorithms, however, on most steps only a
handful of the nodes would have meaningful updates to send. One, hence, runs
the risk of inefficiencies by sending too much irrelevant data across the graph
-- with many intermediate GNN steps having to learn identity functions. In this
work, we explicitly separate the concepts of node state update and message
function invocation. With this separation, we obtain a mathematical formulation
that allows us to reason about asynchronous computation in both algorithms and
neural networks.
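The observation that only a few nodes have meaningful updates on most steps can be seen in a classical setting. The sketch below is a worklist variant of shortest-path relaxation in which a node sends messages only after its own state has changed. It illustrates the motivation only and is not the paper's formulation; the function name `async_shortest_paths`, the graph encoding, and the assumption of non-negative edge weights are choices made here.

```python
from collections import deque
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def async_shortest_paths(edges: Graph, source: Hashable):
    """Relax edges asynchronously: only nodes whose state changed send messages.

    Assumes non-negative edge weights. Returns the distance map and the
    number of messages sent.
    """
    dist = {source: 0.0}
    queue = deque([source])  # nodes with a pending state change
    messages_sent = 0
    while queue:
        u = queue.popleft()
        for v, w in edges.get(u, []):
            messages_sent += 1  # message invocation: u -> v
            candidate = dist[u] + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate  # state update at v
                queue.append(v)      # v now has something to send
    return dist, messages_sent
```

A synchronous scheme would instead have every node send along every edge at every step, including the many steps on which its state is unchanged.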
Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
An important goal in artificial intelligence is to create agents that can
both interact naturally with humans and learn from their feedback. Here we
demonstrate how to use reinforcement learning from human feedback (RLHF) to
improve upon simulated, embodied agents trained to a base level of competency
with imitation learning. First, we collected data of humans interacting with
agents in a simulated 3D world. We then asked annotators to record moments
where they believed that agents either progressed toward or regressed from
their human-instructed goal. Using this annotation data we leveraged a novel
method - which we call "Inter-temporal Bradley-Terry" (IBT) modelling - to
build a reward model that captures human judgments. Agents trained to optimise
rewards delivered from IBT reward models improved with respect to all of our
metrics, including subsequent human judgment during live interactions with
agents. Altogether our results demonstrate how one can successfully leverage
human judgments to improve agent behaviour, allowing us to use reinforcement
learning in complex, embodied domains without programmatic reward functions.
Videos of agent behaviour may be found at https://youtu.be/v_Z9F2_eKk4
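For readers unfamiliar with the name, the classical Bradley-Terry model scores a pairwise preference as a logistic function of a reward difference. The sketch below shows only that classical form; the abstract does not specify how the inter-temporal variant is constructed, so nothing here should be read as the IBT method itself. The function names are hypothetical.

```python
import math


def bt_preference_prob(r_a: float, r_b: float) -> float:
    """Classical Bradley-Terry: probability that item a is preferred to item b,
    given scalar rewards r_a and r_b."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))


def bt_loss(r_preferred: float, r_other: float) -> float:
    """Negative log-likelihood of one observed preference under the model."""
    return -math.log(bt_preference_prob(r_preferred, r_other))
```

Minimising `bt_loss` over annotated pairs pushes the reward of the preferred item above that of the other, which is the usual way such a model is fitted as a reward model.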